Towards Domain-Independent Deep Linguistic Processing: Ensuring Portability and Re-Usability of Lexicalised Grammars

Authors

  • Kostadin Cholakov
  • Valia Kordoni
  • Yi Zhang
Abstract

In this paper we illustrate the importance of making detailed linguistic information a central part of the automatic acquisition of large-scale lexicons, as a means of enhancing robustness while ensuring the maintainability and re-usability of deep lexicalised grammars. Using the error mining techniques proposed in (van Noord, 2004), we show that low lexical coverage is the main hindrance both to the portability of deep lexicalised grammars to domains other than the ones they were originally developed for and to the robustness of systems using such grammars. To this end, we develop linguistically-driven methods that use detailed morphosyntactic information to automatically enhance the performance of deep lexicalised grammars while maintaining their already high linguistic quality.

Similar resources

Multilingual Question Answering With High Portability On Relational Databases

This paper describes a highly portable multilingual question answering system for multiple relational databases. We apply semantic category and pattern-based grammars to natural language interfaces to relational databases. Lexico-semantic pattern (LSP) and multi-level grammars achieve portability across languages, domains, and DBMSs. The LSP-based linguistic processing does not require deep analy...

XMG: a Multi-formalism Metagrammatical Framework

In this paper we introduce XMG (eXtensible MetaGrammar), a system dedicated to the production of wide coverage lexicalised grammars. In particular, we show that XMG provides a representation language suitable for describing different linguistic dimensions and different grammatical formalisms. Furthermore, we briefly sketch the architecture of the XMG compiler showing that it encodes a theoretic...

Baldwin, Timothy (2007) Scalable Deep Linguistic Processing: Mind the Lexical Gap, In Proceedings of the 21st Pacific Asia Conference on Language, Information and Computation (PACLIC21), Seoul, Korea, pp. 3-12

Coverage has been a constant thorn in the side of deployed deep linguistic processing applications, largely because of the difficulty in constructing, maintaining and domain-tuning the complex lexicons that they rely on. This paper reviews various strands of research on deep lexical acquisition (DLA), i.e. the (semi-)automatic creation of linguistically-rich language resources, particularly fro...

Trailfinder - A Case Study in Extracting Spatial Information Using Deep Language Processing

The present paper reports on an end-to-end application using a deep processing grammar to extract spatial and temporal information of prepositional and adverbial expressions from running text. The extraction process is based on the full understanding of the input text. It is represented in a formalism standard for unification-based grammars and with a language-independent vocabulary as far as s...

What grammars tell us about corpora: the case of reduced relative clauses

We present a large (65 million words of Wall Street Journal) and in-depth corpus study of a particular syntactic ambiguity to investigate (1) to what extent the structure of a grammar is reflected in a corpus, and (2) how probability functions defined according to a grammar fit independently established measures of syntactic disambiguation preference. We look at the well-known case of the ambiguity...

Publication date: 2008